Objective. A recent high-density fine-mapping
(ImmunoChip) study of genetic associations in rheumatoid
arthritis (RA) identified 14 risk loci with validated
genome-wide significance, as well as a number of loci
showing associations suggestive of significance
($5 \times 10^{-8} < P < 5 \times 10^{-5}$), but these have yet to be replicated.
The aim of this study was to determine whether these
potentially significant loci are involved in the pathogenesis
of RA, and to explore whether any of the loci are
associated with a specific RA serotype.

Methods. A total of 16 single-nucleotide polymorphisms
(SNPs) were selected for genotyping and association
analyses in 2 independent validation cohorts,
comprising 6,106 RA cases and 4,290 controls. A
meta-analysis of the data from the original ImmunoChip
discovery cohort and from both validation cohorts was
carried out, for a combined total of 17,581 RA cases and
20,160 controls. In addition, stratified analysis of patient
subsets, defined according to their anti-cyclic
citrullinated peptide (anti-CCP) antibody status, was
performed.

Results. A significant association with RA risk 
(P < 0.05) was replicated for 6 of the SNPs assessed in 
the validation cohorts. All SNPs in the validation study 
had odds ratios (ORs) for RA susceptibility in the same 
direction as those in the ImmunoChip discovery study. 
One SNP, rs72928038, mapping to an intron of BACH2,
achieved genome-wide significance in the meta-analysis
($P = 1.2 \times 10^{-8}$, OR 1.12), and a second SNP, rs911263,
mapping to an intron of RAD51B, was significantly
associated in the anti-CCP-positive RA subgroup
($P = 4 \times 10^{-8}$, OR 0.89), confirming that both are RA
susceptibility loci.

Conclusion. This study provides robust evidence
for an association of RA susceptibility with genes involved
in B cell differentiation (BACH2) and DNA
repair (RAD51B). The finding that the RAD51B gene
exhibited different associations based on serologic subtype
adds to the expanding knowledge base in defining
subgroups of RA.

Research Centre, Nuffield Orthopaedic Centre, Oxford, UK; 6 A. G.
Wilson, MB, PhD, FRCP, DCH: University of Sheffield, Sheffield,
UK; 7 Ann W. Morgan, PhD, FRCP: University of Leeds, Leeds, UK;
8 Joel M. Kremer, MD: Albany Medical College and the Center for
Rheumatology, Albany, and Consortium of Rheumatology Researchers
of North America, New York, New York; 9 Dimitrios Pappas, MD,
MPH: Columbia University, and Consortium of Rheumatology
Researchers of North America, New York, New York; 10 Peter Gregersen,
MD: The Feinstein Institute for Medical Research and North Shore-LIJ
Health System, Manhasset, New York; 11 Lars Klareskog, MD,
PhD: Karolinska Institute, Stockholm, Sweden; 12 Jeffrey Greenberg,
MD, MPH: New York University and New York University Hospital
for Joint Diseases, New York, New York.

Dr. Kremer has received consulting fees, speaking fees,
and/or honoraria from AbbVie, Bristol-Myers Squibb, Genentech,
Lilly, and Pfizer (less than $10,000 each) and is an employee of the
Consortium of Rheumatology Researchers of North America
(CORRONA). Dr. Pappas has received consulting fees, speaking fees,
and/or honoraria from Novartis (less than $10,000) and is an employee
of CORRONA. Dr. Barton has received consulting fees, speaking fees,
and/or honoraria from Eli Lilly and Pfizer (less than $10,000 each).

Address correspondence to Stephen Eyre, PhD, University of
Manchester, Arthritis Research UK Epidemiology Unit, Centre for
Musculoskeletal Research, Manchester Academic Health Science
Centre, Stopford Building, Oxford Road, Manchester M13 9PT, UK.
E-mail: steve.eyre@manchester.ac.uk.

Submitted for publication May 29, 2013; accepted in revised
form August 29, 2013.

Rheumatoid arthritis (RA) is a complex, chronic 
autoimmune disease that affects ~1% of the adult
population worldwide (1). In addition to inflammation 
of the synovial joints, RA is characterized by systemic 
inflammation and the presence of serum autoantibodies 
against citrullinated peptides (anti-citrullinated protein 
antibodies [ACPAs]), as defined by positive findings on 
the anti-cyclic citrullinated peptide (anti-CCP) antibody 
test (2). Genome-wide association studies have been 
successful in determining many loci associated with 
complex diseases, including RA (3). The utility of sus- 
ceptibility variants within single genetic loci, in isolation, 
is likely to be limited, with evidence emerging that 
linking multiple associated genes in pathways will lead to 
the better understanding of differing disease mechanisms
(4,5). To achieve robust pathway analysis, a comprehensive
list of associated loci must be defined. Currently,
46 loci have been confirmed to be associated with
RA susceptibility in Caucasians, at accepted levels of
genome-wide significance ($P < 5 \times 10^{-8}$), including 14
loci newly identified in a recent high-density, fine-mapping
(ImmunoChip) study of RA (6).

In studies of inflammatory bowel disease (IBD),
increasing the number of susceptibility markers from 92
to 163 has made the findings much more informative,
implicating risk pathways that had not previously been
recognized as important to this disease.
The increased number of susceptibility loci for IBD has 
also enabled much more informative investigation of 
disease overlap. Genetic studies of RA to date, albeit 
successful, have not yet delivered validated evidence of 
novel pathways. Moreover, disease overlap studies have 
been limited, thus emphasizing the continuing need for 
discovery of disease susceptibility markers in RA. 

RA is currently divided into 2 groups based on 
serologic subtypes, which are defined according to the 
presence or absence of anti-CCP antibodies, although it 
is still unclear whether there are biologic pathways that
are common or distinct to each group (2). Determining
the genetic predisposition to each serologic subtype has 
the potential to better define the mechanism underlying 
each form of disease, enabling progress toward more 
focused clinical management. 

The most recent study aimed at identifying RA 
susceptibility loci (6) used a custom Illumina array 
(ImmunoChip), designed to interrogate 196,524 single- 
nucleotide polymorphisms (SNPs) for 186 loci that have 
been previously shown to be associated with a number of 
autoimmune diseases. The study by Eyre and colleagues 
was the first to be powered to analyze the subgroups of 
seronegative RA and seropositive RA separately. Genotyping
in 11,475 RA cases and 15,870 controls provided
evidence for 14 novel SNPs that achieved genome-wide
significance, with a further 16 SNPs putatively associated
with RA in a second tier of significance
($5 \times 10^{-8} < P < 5 \times 10^{-5}$), either in an unstratified analysis or in
stratified analyses of the anti-CCP antibody subgroups.
We therefore tested the 16 SNPs for which there was 
suggestive evidence of association with RA risk in 2 
independent cohorts, comprising 6,106 RA cases and 
4,290 controls, and performed a meta-analysis in which 
we combined our results with existing data to enhance 
the power to identify associations in the whole data set 
and in subgroups stratified by anti-CCP antibody status. 

PATIENTS AND METHODS 

ImmunoChip discovery cohort. The characteristics of 
the patients in the original ImmunoChip discovery cohort 
(from the UK Rheumatoid Arthritis Consortium International 
[RACI]), as well as the genotyping and quality control proce- 
dures used for the ImmunoChip analysis, have been previously 
described (6). Individual-level data were available for each 
participant, and these data were used in the present analysis. 

Validation cohorts. UK cohort. The UK Rheumatoid 
Arthritis Group (UKRAG) cohort of patients with RA was 
recruited from 6 centers across the UK (Table 1), as previously 
described (7). Genotyping was performed using the Sequenom 
platform. 

US cohort. Samples from patients with RA in the US 
Consortium of Rheumatology Researchers of North America 
(CORRONA) collection and Informatics for Integrating Biol- 
ogy and the Bedside (I2B2) program were also tested using the 
ImmunoChip custom genotyping SNP array (Table 1). Princi- 
pal components analysis was performed using the EigenSoft 
program (version 29) with HapMap phase III samples, to 
exclude individuals of non-European ancestry. To check
relatedness between the CORRONA/I2B2 samples and the RACI
ImmunoChip samples, identity-by-descent and
identity-by-state allele-sharing proportions were estimated in
Plink, using a set of SNPs selected for high quality (missingness
P < 0.002, minor allele frequency [MAF] >0.1) and pruned for
linkage disequilibrium (LD) ($r^2 < 0.2$). Samples were removed if
the PI_HAT value was >0.2 (n = 1 sample per pair).
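
The relatedness filter described above can be sketched as follows. This is a minimal illustration and not the authors' Plink workflow: the `PairEstimate` record, the `samples_to_remove` helper, and the choice to drop the second-listed sample of a pair are assumptions made for the example. Only the PI_HAT cutoff of 0.2 and the rule of removing 1 sample per pair come from the text.

```python
from typing import Iterable, NamedTuple


class PairEstimate(NamedTuple):
    """One row of a pairwise identity-by-descent report (hypothetical layout)."""
    sample_a: str
    sample_b: str
    pi_hat: float


def samples_to_remove(pairs: Iterable[PairEstimate], threshold: float = 0.2) -> set[str]:
    """Pick one sample from every pair whose PI_HAT exceeds the threshold."""
    removed: set[str] = set()
    for pair in pairs:
        if pair.pi_hat <= threshold:
            continue  # pair is not considered related
        # If either member is already excluded, the pair is already resolved.
        if pair.sample_a in removed or pair.sample_b in removed:
            continue
        # Arbitrary but deterministic choice (assumption): drop the second sample.
        removed.add(pair.sample_b)
    return removed


if __name__ == "__main__":
    # Made-up sample identifiers and PI_HAT values, for illustration only.
    report = [
        PairEstimate("S1", "S2", 0.50),
        PairEstimate("S2", "S3", 0.30),
        PairEstimate("S3", "S4", 0.10),
    ]
    print(sorted(samples_to_remove(report)))
```

Skipping pairs that already contain an excluded sample avoids removing more individuals than necessary when one person is related to several others.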
Table 1. Distribution of rheumatoid arthritis (RA) cases and controls in the ImmunoChip discovery
cohort and the 2 validation cohorts*

| Collection | RA cases, all | RA cases, female, % | RA cases, anti-CCP positive | RA cases, anti-CCP negative | Controls, all | Controls, female, % |
|---|---|---|---|---|---|---|
| ImmunoChip | | | | | | |
| US | 2,536 | 75 | 1,803 | 593 | 2,134 | 65 |
| Swedish EIRA | 2,762 | 70 | 1,762 | 987 | 1,940 | 73 |
| Swedish Umea | 852 | 70 | 524 | 242 | 963 | 69 |
| Dutch | 648 | 66 | 330 | 301 | 2,004 | 42 |
| Spanish | 807 | 74 | 397 | 216 | 399 | 65 |
| UK | 3,870 | 74 | 2,406 | 1,000 | 8,430 | 53 |
| Total | 11,475 | 72 | 7,222 | 3,339 | 15,870 | 6L |
| Validation | | | | | | |
| CORRONA/I2B2 (US) | 2,206 | 78 | 1,410 | 446 | 1,863 | 75 |
| UKRAG (UK) | 3,900 | 73 | 536 | 274 | 2,427 | 74 |
| Total | 6,106 | 76 | 1,946 | 720 | 4,290 | 75 |
| Combined | 17,581 | 73 | 9,168 | 4,059 | 20,160 | 65 |

* Except where indicated otherwise, values are the number of participants. Anti-CCP = anti-cyclic
citrullinated peptide; EIRA = Epidemiological Investigation of Rheumatoid Arthritis; CORRONA =
Consortium of Rheumatology Researchers of North America; I2B2 = Informatics for Integrating Biology
and the Bedside; UKRAG = UK Rheumatoid Arthritis Group.




SNP selection/prioritization. Sixteen SNPs were selected
for genotyping, based on suggestive evidence of a
significant association with RA susceptibility
($5 \times 10^{-8} < P < 5 \times 10^{-5}$) in either the overall ImmunoChip genotyping
analysis or in subgroups defined by the presence or absence of
anti-CCP antibodies. Seven SNPs were selected from the full
ImmunoChip cohort, 5 from the anti-CCP-positive subgroup,
and 4 from the anti-CCP-negative subgroup.

Statistical analysis. Stage one: validation analysis. SNPs
were included in the validation analysis if they passed quality
control in both the UKRAG and US cohorts. Samples or SNPs
with a call rate of <98% were removed. In addition, SNPs that
did not conform to Hardy-Weinberg equilibrium
($P < 5 \times 10^{-7}$), SNPs with differential missingness (P < 0.01),
and SNPs with an MAF <0.01 were removed. After applying all of the
quality control filters, 4 SNPs (from 944 RA cases [20%] and
244 controls [9%]) failed quality control.
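
As a rough illustration of how the SNP-level quality control filters combine, the sketch below applies the 4 thresholds to per-SNP summary values. The `SnpQc` record and `passes_qc` helper are hypothetical names, and the example values are invented. The differential-missingness filter is read here as removing SNPs whose test is significant at P < 0.01, which is an assumption about the intended direction of that threshold.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP quality control summary (hypothetical record layout)."""
    snp_id: str
    call_rate: float       # proportion of samples with a genotype call
    hwe_p: float           # Hardy-Weinberg equilibrium test P value
    diff_missing_p: float  # case/control differential missingness P value
    maf: float             # minor allele frequency


def passes_qc(snp: SnpQc) -> bool:
    """Return True if the SNP survives every filter named in the text."""
    return (
        snp.call_rate >= 0.98             # remove call rate < 98%
        and snp.hwe_p >= 5e-7             # remove HWE P < 5 x 10^-7
        and snp.diff_missing_p >= 0.01    # assumption: remove if P < 0.01
        and snp.maf >= 0.01               # remove MAF < 0.01
    )


if __name__ == "__main__":
    # Invented values, for illustration only.
    candidates = [
        SnpQc("snp_ok", 0.995, 0.40, 0.60, 0.19),
        SnpQc("snp_low_call_rate", 0.970, 0.40, 0.60, 0.19),
        SnpQc("snp_hwe_failure", 0.995, 1e-9, 0.60, 0.19),
    ]
    print([s.snp_id for s in candidates if passes_qc(s)])
```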

We tested for the association of each SNP with RA in 
each study independently, using logistic regression under an 
additive model, and then combined the results in a fixed- 
effects meta-analysis with inverse variance weighting (per- 
formed in Plink). The top 10 principal components were 
incorporated as covariates in the logistic regression analysis of 
the US CORRONA data set, to correct for population strati- 
fication. P values less than 0.05 were considered as evidence of 
a significant association in the validation cohorts. 
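
The fixed-effects, inverse variance weighted combination described above can be sketched in a few lines. This is not Plink's implementation; it is a minimal illustration of the standard formula, in which each study's log OR is weighted by the reciprocal of its squared standard error. The function name and the input values are assumptions made for the example, and the 2-tailed P value is taken from a normal approximation.

```python
import math


def fixed_effects_meta(odds_ratios, standard_errors):
    """Combine per-study ORs with inverse variance weights.

    standard_errors are the standard errors of the log ORs.
    Returns (pooled OR, SE of pooled log OR, 2-tailed P value).
    """
    betas = [math.log(o) for o in odds_ratios]
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # 2-tailed P from the standard normal: P = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(pooled_beta), pooled_se, p_value


if __name__ == "__main__":
    # Invented per-study estimates, for illustration only.
    pooled_or, pooled_se, p = fixed_effects_meta([1.13, 1.11], [0.025, 0.038])
    print(f"pooled OR = {pooled_or:.3f}, SE = {pooled_se:.4f}, P = {p:.2e}")
```

Because the weights are reciprocal variances, the larger study dominates the pooled estimate, and the pooled standard error is always smaller than that of any single contributing study.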

Stage two: meta-analysis of ImmunoChip and validation
data. The association of each SNP with RA in the original
ImmunoChip analysis and in the validation analysis was tested
in a fixed-effects meta-analysis with inverse variance weighting
(performed in Plink). A $P_{\text{meta-analysis}}$ threshold of less than
$5 \times 10^{-8}$ was considered significant, i.e., indicative of the discovery
of a novel RA risk gene. All P values reported are 2-tailed.
Heterogeneity between studies was assessed using Cochran's
Q statistic test and the $I^2$ test. A P value for heterogeneity of
less than 0.05 is suggestive of heterogeneity. The $I^2$ test
estimates the extent of heterogeneity and takes values between
0% and 100%. $I^2$ values of 0-25% indicate low heterogeneity,
25-50% moderate heterogeneity, 50-75% high heterogeneity,
and 75-100% extreme heterogeneity (8).

Anti-CCP subset analysis. In separate analyses of the
data from the validation study and the data from the
meta-analysis, we compared associations of all SNPs between
controls and either anti-CCP-positive or anti-CCP-negative RA
patients. In the validation study, there were 1,946
anti-CCP-positive RA patients, 720 anti-CCP-negative RA patients, and
4,290 controls. For the meta-analysis, there were 9,168
anti-CCP-positive RA patients, 4,059 anti-CCP-negative RA
patients, and 20,160 controls.

Power calculations. All power calculations were under- 
taken using Quanto. The model assumed was a log additive 
model, and population baseline risk estimate was 1%. 

RESULTS 

In the validation analysis, 6 loci showed an association
with RA susceptibility that reached significance
(P < 0.05): COG6, PVT1, PTPN2, TNFSF4, RAD51B,
and BACH2 (Table 2). In the meta-analysis, 2 SNPs
achieved genome-wide significance levels
($P_{\text{meta-analysis}} < 5 \times 10^{-8}$): BACH2 in the full meta-analysis (Figure
1A), and RAD51B in the anti-CCP-positive subgroup
(Figure 1B). These findings in the meta-analysis indicate
that BACH2 and RAD51B are novel validated RA risk
alleles. For the remaining SNPs, all effect sizes were in the
same direction as in the original ImmunoChip analysis.

In the meta-analysis, 1 SNP, rs7535176, showed
evidence of high heterogeneity between studies ($I^2$ =
57%), and was therefore removed from further analysis.
In power calculations, the average power to detect
an association with RA at an OR of 1.12
($P = 5 \times 10^{-8}$) was 81% in the overall validation study,
41% in the anti-CCP-positive RA subset, and 6% in the
anti-CCP-negative RA subset, based on an average
MAF of 0.19.

Table 2. Associations of single-nucleotide polymorphisms (SNPs) with rheumatoid arthritis susceptibility in the ImmunoChip discovery cohort,
validation cohorts, and meta-analysis*

| Chr | SNP | Position (hg19) | Gene | Minor allele | ImmunoChip P | ImmunoChip OR | Validation P† | Validation OR | Meta-analysis, overall P | Meta-analysis, overall OR | Meta-analysis, anti-CCP+ P | Meta-analysis, anti-CCP+ OR | Meta-analysis, anti-CCP- P | Meta-analysis, anti-CCP- OR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | rs61828284‡ | 173,299,743 | TNFSF4 | A | $5.5 \times 10^{-6}$ | 0.83 | 0.01 | 0.81 | $1.5 \times 10^{-5}$ | 0.88 | $2.1 \times 10^{-7}$ | 0.82 | 0.48 | 0.97 |
| 2 | rs888427§ | 172,368,120 | CYBRD1 | A | $5.2 \times 10^{-5}$ | 1.12 | 0.76 | 1.02 | 0.00021 | 1.06 | 0.017 | 1.05 | $1.6 \times 10^{-4}$ | 1.1 |
| 3 | rs1875463¶ | 143,165,502 | SLC9A9 | A | $1.5 \times 10^{-6}$ | 0.90 | 0.69 | 0.99 | $2.0 \times 10^{-5}$ | 0.92 | $1.2 \times 10^{-4}$ | 0.92 | $8.7 \times 10^{-5}$ | 0.89 |
| 6 | rs72928038¶ | 90,976,768 | BACH2 | A | $8.2 \times 10^{-7}$ | 1.13 | 0.004 | 1.11 | $1.1 \times 10^{-8}$ | 1.12 | $8.5 \times 10^{-6}$ | 1.12 | $2.5 \times 10^{-4}$ | 1.13 |
| 7 | rs59472144¶ | 37,369,908 | ELMO1 | A | $3.5 \times 10^{-6}$ | 1.12 | 0.78 | 1.01 | $5.1 \times 10^{-5}$ | 1.09 | $9.3 \times 10^{-5}$ | 1.10 | 0.04 | 1.07 |
| 8 | rs6651252¶ | 129,567,181 | PVT1 | G | $2.0 \times 10^{-5}$ | 0.89 | 0.004 | 0.88 | $2.4 \times 10^{-7}$ | 0.89 | $5.3 \times 10^{-5}$ | 0.89 | 0.001 | 0.88 |
| 9 | rs7857530§ | 102,747,941 | TXNDC4 | G | $2.5 \times 10^{-5}$ | 0.89 | 0.94 | 1.00 | 0.022 | 0.96 | 0.40 | 0.98 | $1.7 \times 10^{-4}$ | 0.91 |
| 13 | rs7993214‡ | 40,350,912 | COG6 | A | $4.6 \times 10^{-6}$ | 0.90 | 0.05 | 0.92 | $9.1 \times 10^{-7}$ | 0.92 | $6.0 \times 10^{-7}$ | 0.9 | 0.15 | 0.96 |
| 13 | rs17230016¶ | 82,338,338 | SPRY2 | G | $9.6 \times 10^{-6}$ | 0.88 | 0.56 | 0.97 | $4.7 \times 10^{-5}$ | 0.91 | 0.003 | 0.92 | 0.001 | 0.88 |
| 14 | rs911263‡ | 68,753,593 | RAD51B | G | $2.3 \times 10^{-6}$ | 0.89 | 0.005 | 0.88 | $4.0 \times 10^{-5}$ | 0.93 | $4.0 \times 10^{-8}$ | 0.89 | 0.44 | 0.98 |
| 17 | rs8074003¶ | 76,364,493 | SOCS3 | A | $1.4 \times 10^{-5}$ | 0.81 | 0.44 | 0.95 | $9.2 \times 10^{-5}$ | 0.86 | $1.9 \times 10^{-3}$ | 0.86 | $9.2 \times 10^{-4}$ | 0.80 |
| 18 | rs62097857¶ | 12,857,758 | PTPN2 | A | $4.5 \times 10^{-6}$ | 1.22 | 0.01 | 1.20 | $1.4 \times 10^{-7}$ | 1.22 | $3.9 \times 10^{-6}$ | 1.23 | $6.2 \times 10^{-5}$ | 1.26 |

* Association analyses of SNPs were performed using case and control data from the original ImmunoChip discovery cohort, the validation cohorts
(UK and US), and the meta-analysis (ImmunoChip and validation cohorts). OR = odds ratio.
† Validation P value is presented for each SNP in the same stratification as that in the discovery study.
‡ Anti-cyclic citrullinated peptide (anti-CCP)-positive RA subgroup.
§ Anti-CCP-negative RA subgroup.
¶ Full RA group.

DISCUSSION 

Of the 12 loci assessed for an association with 
susceptibility to RA in the validation study, 2 reached 
accepted genome-wide significance thresholds either in 
the RA group as a whole or in the subset with anti-CCP 
antibodies, indicating that these 2 loci (BACH2, 
RAD51B) represent novel RA susceptibility loci. A 
further 4 SNPs showed evidence for validation of an 
association at a significance level of P < 0.05 (COG6, 
PVT1, PTPN2, and TNFSF4). All of the loci assessed 
showed effect sizes in the same direction as those in the 
original ImmunoChip discovery study. Of the newly
confirmed RA susceptibility SNPs, rs72928038 maps to
an intron of BACH2 (6q15), which has previously been
associated with type 1 diabetes (9), Crohn's disease (10),
and celiac disease (11). BACH2 encodes BTB and CNC
homology 1, basic leucine zipper transcription factor 2, a
B cell-specific transcription factor that has been shown
to regulate the BLIMP1 gene (also known as PRDM1, a
validated RA susceptibility locus) in mice, which in turn
influences plasma cell differentiation and promotes the
antibody class switch (12). Rituximab, an effective RA
treatment, depletes CD20+ B cells, a cell subtype in
which BACH2 is highly expressed.

The second novel SNP associated with RA,
rs911263, was found to be significantly associated
with RA risk in the anti-CCP-positive subgroup
($P_{\text{anti-CCP+ meta-analysis}} = 4 \times 10^{-8}$) but not in the
anti-CCP-negative subgroup ($P_{\text{anti-CCP- meta-analysis}} = 0.45$).
This SNP maps to an intron in RAD51B (14q24),
which has recently been found to be associated with
primary biliary cirrhosis (13). RAD51B is a gene involved
in the homologous recombination repair pathway
of double-stranded DNA breaks. It has been demonstrated
that anti-CCP-positive and anti-CCP-negative
disease have differing allelic associations at the HLA
locus, and identification of an association of RAD51B
with anti-CCP-positive RA may add to the list of genes
that differentiate between the disease subgroups, although
this could not be formally proven in the present
study because of the reduced power in the
anti-CCP-negative cohort.

[Figure 1 panels: A, BACH2 (rs72928038); B, RAD51B (rs911263). Rows in each panel: overall, ACPA+ve, ACPA-ve.]

Figure 1. Forest plots of the association of rheumatoid arthritis (RA) susceptibility with the 2 single-nucleotide polymorphisms showing
genome-wide significance, BACH2 (A) and RAD51B (B), in the meta-analyses. Results are presented as the odds ratio (OR) with 95% confidence
interval (95% CI) for the overall meta-analysis and for the subgroups of anti-citrullinated protein antibody-positive (ACPA+ve) RA and
ACPA-negative (ACPA-ve) RA (determined by anti-cyclic citrullinated peptide test). The meta-analyses are based on a fixed-effects model.

An association of 4 loci (COG6, PVT1, PTPN2, 
and TNFSF4) with RA risk was replicated in the valida- 
tion study, at P < 0.05, but none reached genome-wide 
significance in the overall analysis. Given that these loci 
are associated with multiple autoimmune diseases, there 
is strong a priori evidence that these are likely to be RA 
susceptibility loci. Indeed, an association of RA risk with 
PTPN2 has now reached genome-wide significance lev- 
els, as demonstrated in an expanded, independent ana- 
lysis in a European cohort (14). 

The number of SNPs with suggestive evidence 
from the ImmunoChip analysis suggests that there are 
many additional undiscovered risk alleles with modest 
effect sizes that have yet to be formally confirmed at 
genome-wide significance levels of association. Power
calculations showed that larger sample sizes are needed
to detect the more modest effect sizes seen in this
second tier of significance, indicating that there is still
great value in increasing study size to identify novel
susceptibility loci. Indeed, combined sample sizes of
more than 75,000 cases and controls were required to 
identify the 163 loci now confirmed to be associated with 
IBD. Sample size and power therefore remain a limita- 
tion in our current study, in particular with respect to the 
lack of data on the ACPA status of patients in the 
validation cohorts. A further limitation in this study was 
the targeted SNP genotyping, chosen as a cost-effective 
strategy to confirm putative associations. By adopting 
this strategy, our capacity to fine map the newly identi- 
fied loci was restricted to data generated in the Immu- 
noChip experiment. 

In conclusion, we obtained convincing evidence
of 2 new RA susceptibility loci, BACH2 and RAD51B, in
a combined meta-analysis of 17,581 cases and 20,160
controls, bringing the total number of RA susceptibility
loci confirmed in Caucasians to 48. These findings
implicate 2 distinct biologic pathways, those of B cell
differentiation and DNA repair, in serologic subtypes of
RA. Identification of all genetic variants that predispose
to RA will increase our understanding of the molecular
mechanisms involved and better annotate biologic pathway
analyses.